Identifying bioactivity events of small molecules from the scientific literature

Author

  • Yan Ying
Abstract

This PhD work focused on the automatic extraction of small molecule bioactivity information from scientific literature, with the ultimate aim of contributing to the small molecule ontology ChEBI. The ChEBI ontology plays an important role in assisting biomedical research, but requires further expansion in order to improve the ability to interrelate and link ChEBI to other primary biological resources. Small molecule bioactivity is a concept that presents small molecule characteristics in scenarios involving other biological objects. Bioactivity effects can have various targets including genes, proteins, organs, and complete organisms. Two methods of extracting these events were examined. In the first, a supervised machine learning approach was proposed and developed based on a new corpus: DrugPro. The developed system comprised cascaded classifiers and a post-processing unit to extract the specific relationship between drugs and bacteria. The classification system performed well when domain-specific features were combined with a term weighting scheme, achieving precision of 92.56% and recall of 93.22% on the test corpus. A sentence classification system also achieved reasonable results, with precision of 58.9% and recall of 57.4% on the test corpus. A small set of language patterns was developed to extract the key items of information from the positive sentences. The second method of extracting small molecule bioactivity events involved the development of an entirely rule-based extraction system. The performance of the system was evaluated on a random selection of 20 abstracts, which resulted in 94.7% precision and 94.7% recall on this (admittedly limited) data set. The results indicated that the rule-based approach has the potential to precisely identify general small molecule bioactivity events in a much larger collection of biomedical text.
It is concluded that extension and enhancement of the ChEBI ontology could be based on the concept of small molecule bioactivity, and text mining provides a viable strategy by which this can be implemented. A further contribution of the work is the set of text mining resources created here, including the annotated corpus, and an investigation into the best way to undertake named-entity recognition for the multi-domain concept of bioactivity. It
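
As a rough illustration of the rule-based idea and of how precision and recall are computed, the following is a minimal sketch only. The abstract does not give the thesis's actual patterns, so the trigger verbs, the regular expression, and the function names `extract_events` and `precision_recall` are all assumptions made for this example.

```python
import re

# Hypothetical pattern: "<Molecule> <trigger verb> <target>".
# The trigger verbs and the token shapes are illustrative assumptions,
# not the language patterns developed in the thesis.
PATTERN = re.compile(
    r"\b(?P<molecule>[A-Z][a-z]+)\s+"
    r"(?P<trigger>inhibits|activates|induces)\s+"
    r"(?P<target>[A-Za-z0-9-]+)"
)


def extract_events(sentence):
    """Return the set of (molecule, trigger, target) triples matched in a sentence."""
    return {
        (m["molecule"], m["trigger"], m["target"])
        for m in PATTERN.finditer(sentence)
    }


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted events against gold annotations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    events = extract_events("Aspirin inhibits COX-1 in platelets.")
    gold = {("Aspirin", "inhibits", "COX-1")}
    print(events, precision_recall(events, gold))
```

Precision is the fraction of extracted events that are correct, and recall is the fraction of annotated events that were found. A single hand-written pattern like this one would miss most real sentences; it is meant only to show the shape of such a rule and of its evaluation.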

Similar resources

P122: Small Molecules as Chemical and Pharmacological Tools for Neuroinflammatory Diseases Treatment (with Emphasis on Multiple Sclerosis)

Multiple Sclerosis (MS) is a neuroinflammatory disease resulting in degeneration of the myelin sheaths and death of oligodendrocytes. So far, several strategies have been introduced to control the disease. Treatment with small molecules is one strategy that has recently attracted attention in the scientific community. These molecules that target epigenetic and other cellular proce...

Identifying Compound-Target Associations by Combining Bioactivity Profile Similarity Search and Public Databases Mining

Molecular target identification is of central importance to drug discovery. Here, we developed a computational approach, named bioactivity profile similarity search (BASS), for associating targets to small molecules by using the known target annotations of related compounds from public databases. To evaluate BASS, a bioactivity profile database was constructed using 4296 compounds that were com...

Introducing the Pattern of Identifying Highly Deprived Areas In Order to Target the System of Jihadist Movements (Case Study: Boushehr Province; Dashty County)

To achieve development in deprived areas, as the ultimate goal of deprivation planning, the first step is to get a real understanding of the status quo and the level of ownership of the areas as a prelude to development. Thus, to offer a robust model of deprivation, it is important to identify deprivation indicators and related parameters to reduce deprivation. Accordingly, the presen...

Discovery of Novel Glucagon Receptor Antagonists Using Combined Pharmacophore Modeling and Docking

Glucagon and the glucagon receptor are the most important molecules controlling blood glucose concentrations. These two molecules are very important to studies of type 2 diabetic patients. In the literature, several classes of small molecule antagonists of the human glucagon receptor have been reported. Glucagon receptor antagonists could decrease hepatic glucose output and improve glucose control in d...

Publication date: 2015